基于深入的学习的诊断性能随着更多的注释数据而增加,但手动注释是大多数领域的瓶颈。专家在临床常规期间评估诊断图像,并在报告中写出他们的调查结果。基于临床报告的自动注释可以克服手动标记瓶颈。我们假设可以使用这些报告的稀疏信息引导的模型预测来生成用于检测任务的密度注释。为了证明疗效,我们在放射学报告中临床显着发现的数量指导的临床上显着的前列腺癌(CSPCA)注释。我们包括7,756个前列腺MRI检查,其中3,050人被手动注释,4,706次自动注释。我们对手动注释的子集进行了自动注释质量:我们的得分提取正确地确定了99.3 \%$ 99.3 \%$ 99.3 \%$的CSPCA病变数量,我们的CSPCA分段模型正确地本地化了83.8 \ PM 1.1 \%$的病变。我们评估了来自外部中心的300名检查前列腺癌检测表现,具有组织病理学证实的基础事实。通过自动标记的考试增强培训集改善了在接收器的患者的诊断区域,从$ 88.1 \ pm 1.1 \%$至89.8 \ pm 1.0 \%$($ p = 1.2 \ cdot 10 ^ { - 4} $ )每案中的一个错误阳性的基于病变的敏感性,每案件从79.2美元2.8 \%$ 85.4 \ PM 1.9 \%$($ P <10 ^ { - 4} $),以$ alm \ pm std。$超过15个独立运行。这种改进的性能展示了我们报告引导的自动注释的可行性。源代码在https://github.com/diagnijmegen/report-guiding-annotation上公开可用。最佳的CSPCA检测算法在https://grand-challenge.org/algorithms/bpmri-cspca-detection-report-guiding-annotations/中提供。
translated by 谷歌翻译
Decentralized and federated learning algorithms face data heterogeneity as one of the biggest challenges, especially when users want to learn a specific task. Even when personalized headers are used concatenated to a shared network (PF-MTL), aggregating all the networks with a decentralized algorithm can result in performance degradation as a result of heterogeneity in the data. Our algorithm uses exchanged gradients to calculate the correlations among tasks automatically, and dynamically adjusts the communication graph to connect mutually beneficial tasks and isolate those that may negatively impact each other. This algorithm improves the learning performance and leads to faster convergence compared to the case where all clients are connected to each other regardless of their correlations. We conduct experiments on a synthetic Gaussian dataset and a large-scale celebrity attributes (CelebA) dataset. The experiment with the synthetic data illustrates that our proposed method is capable of detecting tasks that are positively and negatively correlated. Moreover, the results of the experiments with CelebA demonstrate that the proposed method may produce significantly faster training results than fully-connected networks.
translated by 谷歌翻译
Multi-task learning (MTL) is a learning paradigm to learn multiple related tasks simultaneously with a single shared network where each task has a distinct personalized header network for fine-tuning. MTL can be integrated into a federated learning (FL) setting if tasks are distributed across clients and clients have a single shared network, leading to personalized federated learning (PFL). To cope with statistical heterogeneity in the federated setting across clients which can significantly degrade the learning performance, we use a distributed dynamic weighting approach. To perform the communication between the remote parameter server (PS) and the clients efficiently over the noisy channel in a power and bandwidth-limited regime, we utilize over-the-air (OTA) aggregation and hierarchical federated learning (HFL). Thus, we propose hierarchical over-the-air (HOTA) PFL with a dynamic weighting strategy which we call HOTA-FedGradNorm. Our algorithm considers the channel conditions during the dynamic weight selection process. We conduct experiments on a wireless communication system dataset (RadComDynamic). The experimental results demonstrate that the training speed with HOTA-FedGradNorm is faster compared to the algorithms with a naive static equal weighting strategy. In addition, HOTA-FedGradNorm provides robustness against the negative channel effects by compensating for the channel conditions during the dynamic weight selection process.
translated by 谷歌翻译
The application of natural language processing (NLP) to cancer pathology reports has been focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform the CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer and precancerous cases. We achieved 0.914 macro-F1 scores for classifying patients into negative, non-advanced adenoma, advanced adenoma and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrated the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
translated by 谷歌翻译
Machine learning algorithms have revolutionized different fields, including natural language processing, computer vision, signal processing, and medical data processing. Despite the excellent capabilities of machine learning algorithms in various tasks and areas, the performance of these models mainly deteriorates when there is a shift in the test and training data distributions. This gap occurs due to the violation of the fundamental assumption that the training and test data are independent and identically distributed (i.i.d). In real-world scenarios where collecting data from all possible domains for training is costly and even impossible, the i.i.d assumption can hardly be satisfied. The problem is even more severe in the case of medical images and signals because it requires either expensive equipment or a meticulous experimentation setup to collect data, even for a single domain. Additionally, the decrease in performance may have severe consequences in the analysis of medical records. As a result of such problems, the ability to generalize and adapt under distribution shifts (domain generalization (DG) and domain adaptation (DA)) is essential for the analysis of medical data. This paper provides the first systematic review of DG and DA on functional brain signals to fill the gap of the absence of a comprehensive study in this era. We provide detailed explanations and categorizations of datasets, approaches, and architectures used in DG and DA on functional brain images. We further address the attention-worthy future tracks in this field.
translated by 谷歌翻译
智能仪表测量值虽然对于准确的需求预测至关重要,但仍面临一些缺点,包括消费者的隐私,数据泄露问题,仅举几例。最近的文献探索了联合学习(FL)作为一种有前途的隐私机器学习替代方案,该替代方案可以协作学习模型,而无需将私人原始数据暴露于短期负载预测中。尽管有着美德,但标准FL仍然容易受到棘手的网络威胁,称为拜占庭式攻击,这是由错误和/或恶意客户进行的。因此,为了提高联邦联邦短期负载预测对拜占庭威胁的鲁棒性,我们开发了一个最先进的基于私人安全的FL框架,以确保单个智能电表的数据的隐私,同时保护FL的安全性模型和架构。我们提出的框架利用了通过符号随机梯度下降(SignsGD)算法的梯度量化的想法,在本地模型培训后,客户仅将梯度的“符号”传输到控制中心。当我们通过涉及一组拜占庭攻击模型的基准神经网络的实验突出显示时,我们提出的方法会非常有效地减轻此类威胁,从而优于常规的FED-SGD模型。
translated by 谷歌翻译
随着智能设备的扩散和通信中的旋转,配电系统逐渐从被动,手动操作和不灵活的,到大规模互连的网络物理智能电网,以解决未来的能源挑战。然而,由于部署的大规模复杂性和资源限制,若干尖端技术的集成引入了几种安全和隐私漏洞。最近的研究趋势表明,虚假数据注入(FDI)攻击正成为整个智能电网范式内最恶毒的网络威胁之一。因此,本文介绍了对积极分配系统内的直接投资袭击事件的最近进展的全面调查,并提出了分类法,以对智能电网目标进行外商直接投资威胁。相关研究与攻击方法和对电力分配网络的影响形成鲜明对比和总结。最后,我们确定了一些研究差距并推荐了一些未来的研究方向,以指导和激励前瞻性研究人员。
translated by 谷歌翻译